Self-supervised representation learning follows a paradigm of withholding some part of the data and tasking the network to predict it from the remaining part. Towards this end, masking has emerged as a generic and powerful tool where content is withheld along the sequential dimension, e.g., spatial in images, temporal in audio, and syntactic in language. In this paper, we explore the orthogonal channel dimension for generic data augmentation. The data for each channel is quantized through a non-uniform quantizer, with the quantized value sampled randomly within randomly sampled quantization bins. From another perspective, quantization is analogous to channel-wise masking, as it removes the information within each bin but preserves the information across bins. We apply the randomized quantization in conjunction with sequential augmentations on self-supervised contrastive models. This generic approach achieves results on par with modality-specific augmentation on vision tasks, and state-of-the-art results on 3D point clouds as well as on audio. We also demonstrate that this method is applicable for augmenting intermediate embeddings in a deep neural network on the comprehensive DABS benchmark, which comprises various data modalities. Code is available at http://www.github.com/microsoft/random_quantize.
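The randomized quantizer described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `random_quantize` and `augment`, the default of 8 bins, and the use of uniform sampling for both the bin edges and the per-bin values are all choices made here for illustration. Plain Python lists are used to keep the sketch dependency-free.

```python
import bisect
import random

def random_quantize(channel, n_bins=8, rng=None):
    """Randomly quantize one channel, given as a flat list of floats."""
    if rng is None:
        rng = random.Random()
    lo, hi = min(channel), max(channel)
    # Non-uniform bins: the n_bins - 1 interior edges are sampled at random.
    interior = sorted(rng.uniform(lo, hi) for _ in range(n_bins - 1))
    edges = [lo] + interior + [hi]
    # One representative value per bin, drawn uniformly inside that bin.
    values = [rng.uniform(a, b) for a, b in zip(edges, edges[1:])]
    out = []
    for v in channel:
        # Index of the bin containing v, clamped so v == hi lands in the last bin.
        i = min(max(bisect.bisect_right(edges, v) - 1, 0), n_bins - 1)
        out.append(values[i])
    return out

def augment(x, n_bins=8, seed=None):
    """Quantize every channel of x independently (x is a list of channels)."""
    rng = random.Random(seed)
    return [random_quantize(ch, n_bins, rng) for ch in x]
```

Every input value that falls in the same bin is mapped to the same output, so detail within a bin is discarded, while the ordering of values across bins is preserved. This is the sense in which the abstract compares quantization to channel-wise masking.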
Translated by Google Translate
The application of natural language processing (NLP) to cancer pathology reports has focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer from precancerous cases. We achieved a macro-F1 score of 0.914 for classifying patients into negative, non-advanced adenoma, advanced adenoma, and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrate the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
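For readers unfamiliar with the metric, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes such as advanced adenoma count as much as common ones. The sketch below shows the standard computation; the function name `macro_f1` is chosen here for illustration, and the snippet says nothing about how the paper's models produce their predictions.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

It would be called with the four class names from the abstract as `labels`, for example `macro_f1(y_true, y_pred, ["negative", "non-advanced adenoma", "advanced adenoma", "CRC"])`.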
Large language models (LLMs) show excellent performance but are compute- and memory-intensive. Quantization can reduce memory and accelerate inference. However, for LLMs beyond 100 billion parameters, existing methods cannot maintain accuracy or do not run efficiently on hardware. We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs that can be implemented efficiently. We observe that systematic outliers appear at fixed activation channels. Based on the fact that weights are easy to quantize while activations are not, SmoothQuant smooths the activation outliers by offline migrating the quantization difficulty from activations to weights with a mathematically equivalent transformation. SmoothQuant enables an INT8 quantization of both weights and activations for all the GEMMs in LLMs, including OPT-175B, BLOOM-176B, and GLM-130B. SmoothQuant has better hardware efficiency than existing techniques using mixed-precision activation quantization or weight-only quantization. We demonstrate up to 1.56x speedup and 2x memory reduction for LLMs with negligible loss in accuracy. Thanks to the hardware-friendly design, we integrate SmoothQuant into FasterTransformer, a state-of-the-art LLM serving framework, and achieve faster inference speed with half the number of GPUs compared to FP16. Our work offers a turn-key solution that reduces hardware costs and democratizes LLMs. Code is available at: https://github.com/mit-han-lab/smoothquant.
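The "mathematically equivalent transformation" can be illustrated with per-channel scaling: dividing each activation channel by a factor and multiplying the matching weight row by the same factor leaves the matrix product unchanged while shrinking activation outliers. The sketch below is an illustration under assumptions, not the released code: the helper names, the `alpha` parameter with its default of 0.5, and the specific rule for choosing the scale are not stated in the abstract and are assumed here. Plain Python lists keep it dependency-free.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def smooth(X, W, alpha=0.5, eps=1e-8):
    """Move quantization difficulty from activations X to weights W.

    X has shape (tokens, n_in) and W has shape (n_in, n_out).
    Channel j of X is divided by s[j]; row j of W is multiplied by s[j].
    The scale rule below (assumed) balances the two per-channel maxima.
    """
    n_in = len(W)
    act_max = [max(abs(row[j]) for row in X) for j in range(n_in)]
    w_max = [max(abs(w) for w in W[j]) for j in range(n_in)]
    s = [max(a, eps) ** alpha / max(w, eps) ** (1 - alpha)
         for a, w in zip(act_max, w_max)]
    X_s = [[row[j] / s[j] for j in range(n_in)] for row in X]
    W_s = [[w * s[j] for w in W[j]] for j in range(n_in)]
    return X_s, W_s, s

# Toy input: channel 1 of X carries the outliers.
X = [[0.1, 40.0, -0.2], [-0.3, -60.0, 0.1]]
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]
X_s, W_s, s = smooth(X, W)
```

Because the scaling cancels inside the product, `matmul(X_s, W_s)` matches `matmul(X, W)` up to floating-point error, and the scales can be folded into the weights offline. With `alpha=0.5` the largest magnitude in each activation channel equals the largest magnitude in the corresponding weight row, so neither side dominates the quantization range.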
Deep learning methods have contributed substantially to the rapid advancement of medical image segmentation, the quality of which relies on the suitable design of loss functions. Popular loss functions, including the cross-entropy and Dice losses, often fall short in boundary detection, thereby limiting high-resolution downstream applications such as automated diagnoses and procedures. We developed a novel loss function that is tailored to reflect boundary information and enhance boundary detection. As the contrast between segmentation and background regions along the classification boundary naturally induces heterogeneity over the pixels, we propose the piece-wise two-sample t-test augmented (PTA) loss, which is infused with the statistical test for such heterogeneity. We demonstrate the improved boundary detection power of the PTA loss compared to benchmark losses without a t-test component.
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models. Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency. Particularly, backpropagation through time (BPTT) with surrogate gradients (SG) is popularly used to achieve high performance in a very small number of time steps. However, it is at the cost of large memory consumption for training, lack of theoretical clarity for optimization, and inconsistency with the online property of biological learning and rules on neuromorphic hardware. Other works connect spike representations of SNNs with equivalent artificial neural network formulation and train SNNs by gradients from equivalent mappings to ensure descent directions. But they fail to achieve low latency and are also not online. In this work, we propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning by tracking presynaptic activities and leveraging instantaneous loss and gradients. Meanwhile, we theoretically analyze and prove that gradients of OTTT can provide a similar descent direction for optimization as gradients based on spike representations under both feedforward and recurrent conditions. OTTT only requires constant training memory costs agnostic to time steps, avoiding the significant memory costs of BPTT for GPU training. Furthermore, the update rule of OTTT is in the form of three-factor Hebbian learning, which could pave a path for online on-chip learning. With OTTT, it is the first time that two mainstream supervised SNN training methods, BPTT with SG and spike representation-based training, are connected, and meanwhile in a biologically plausible form. Experiments on CIFAR-10, CIFAR-100, ImageNet, and CIFAR10-DVS demonstrate the superior performance of our method on large-scale static and neuromorphic datasets in small time steps.
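The idea of forward-in-time learning with constant memory can be sketched as follows: instead of storing the whole spike history for backpropagation through time, keep one running trace of presynaptic activity per input and combine it with an instantaneous error signal at each step. This is a loose illustration under assumptions, not the paper's algorithm: the function name, the exponential decay constant, the learning rate, and the externally supplied `errors` (standing in for the instantaneous gradient with respect to the postsynaptic neurons) are all hypothetical.

```python
def online_trace_updates(spikes, errors, decay=0.5, lr=0.1):
    """Accumulate weight updates step by step without storing past steps.

    spikes[t] is the presynaptic spike vector at time t.
    errors[t] is an instantaneous error signal for the postsynaptic neurons.
    Returns the accumulated weight change (n_post x n_pre) and the final trace.
    """
    n_pre, n_post = len(spikes[0]), len(errors[0])
    delta_w = [[0.0] * n_pre for _ in range(n_post)]
    trace = [0.0] * n_pre
    for s_t, g_t in zip(spikes, errors):
        # Running presynaptic trace: the only state carried across time.
        trace = [decay * a + s for a, s in zip(trace, s_t)]
        # Outer product of the instantaneous error and the trace.
        for i in range(n_post):
            for j in range(n_pre):
                delta_w[i][j] -= lr * g_t[i] * trace[j]
    return delta_w, trace
```

The state kept between steps is a single vector of length `n_pre`, regardless of how many time steps are processed, which is the property the abstract contrasts with the memory cost of BPTT. The update is a product of a presynaptic term and a postsynaptic error term, consistent with the Hebbian-style form the abstract mentions.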
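Knowledge distillation is an effective way to transfer knowledge from a strong teacher to an efficient student model. Ideally, we expect that the better the teacher is, the better the student will be. However, this expectation does not always hold. A better teacher model commonly produces a worse student through distillation, owing to the non-negligible gap between teacher and student. To bridge this gap, we propose PROD, a PROgressive Distillation method for dense retrieval. PROD consists of a teacher progressive distillation and a data progressive distillation that gradually improve the student. We conduct extensive experiments on five widely used benchmarks, MS MARCO Passage, TREC Passage 19, TREC Document 19, MS MARCO Document, and Natural Questions, where PROD achieves state-of-the-art results among distillation methods for dense retrieval. The code and models will be released.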
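As one of the most successful AI-powered applications, recommender systems aim to help people make appropriate decisions effectively and efficiently by providing personalized suggestions in many aspects of our lives, especially for human-oriented online services such as e-commerce platforms and social media sites. Over the past few decades, the rapid development of recommender systems has greatly benefited people by creating economic value, saving time and effort, and promoting social good. However, recent studies have found that data-driven recommender systems can pose serious threats to users and society, such as spreading fake news to manipulate public opinion on social media sites, amplifying unfairness toward under-represented groups or individuals in job matching services, or inferring private information from recommendation results. The trustworthiness of these systems has therefore attracted growing attention from many directions, with the aim of mitigating the negative impacts of recommender systems and strengthening public trust in recommender system technology. In this survey, we provide a comprehensive overview of trustworthy recommender systems (TRec), with a specific focus on six of the most important aspects: safety and robustness, non-discrimination and fairness, explainability, privacy, environmental well-being, and accountability and auditability. For each aspect, we summarize recent related techniques and discuss potential research directions to help achieve trustworthy recommender systems in the future.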
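Deliberation is a common and natural behavior in everyday human life. For example, when writing papers or articles, we usually write a first draft and then polish it iteratively until we are satisfied. Inspired by this cognitive process, we propose DECOM, a multi-pass deliberation framework for automatic comment generation. DECOM consists of multiple deliberation models and one evaluation model. Given a code snippet, we first extract keywords from the code and retrieve a similar code fragment from a predefined corpus. We then treat the comment of the retrieved code as the initial draft and feed it, together with the code and keywords, into DECOM to start the iterative deliberation process. At each pass, the deliberation model polishes the draft and generates a new comment. The evaluation model measures the quality of the newly generated comment to decide whether to end the iterative process. Once the process terminates, the best generated comment is selected as the target comment. Our approach is evaluated on two real-world datasets, in Java (87K) and Python (108K), and the experimental results show that it outperforms state-of-the-art baselines. A human evaluation study also confirms that the comments generated by DECOM tend to be more readable, informative, and useful.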
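The draft-polish-evaluate loop described above can be sketched as a small control loop. This is an illustration under assumptions, not the DECOM implementation: the function name `deliberate`, the `polish` and `score` callables (stand-ins for the deliberation and evaluation models), the `max_passes` limit, and the rule of stopping as soon as the score no longer improves are all hypothetical.

```python
def deliberate(code, keywords, draft, polish, score, max_passes=3):
    """Iteratively polish a draft comment and return the best one seen.

    polish(code, keywords, comment) returns a revised comment.
    score(code, comment) returns a quality estimate (higher is better).
    """
    best, best_score = draft, score(code, draft)
    current = draft
    for _ in range(max_passes):
        candidate = polish(code, keywords, current)
        candidate_score = score(code, candidate)
        # Assumed stopping rule: end the process once a pass stops helping.
        if candidate_score <= best_score:
            break
        best, best_score = candidate, candidate_score
        current = candidate
    return best

# Toy stand-ins for the two models, for demonstration only.
KEYWORDS = ["add", "two", "numbers"]

def toy_polish(code, keywords, comment):
    """Append the first keyword that the comment does not yet contain."""
    words = comment.split()
    for kw in keywords:
        if kw not in words:
            return comment + " " + kw
    return comment

def toy_score(code, comment):
    """Count how many keywords the comment covers."""
    return sum(kw in comment.split() for kw in KEYWORDS)

result = deliberate("def add(a, b): return a + b", KEYWORDS, "add", toy_polish, toy_score)
```

In this sketch the initial draft would come from the comment of the retrieved similar code, as the abstract describes; here it is passed in directly so that the loop itself stays in focus.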
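Automatic speech recognition models require large amounts of speech data for training, and collecting such data often raises privacy concerns. Federated learning has been widely used and is regarded as an effective decentralized technique: it collaboratively learns a shared prediction model while keeping the data local on different client devices. However, the limited computation and communication resources on client devices present practical difficulties for large models. To overcome these challenges, we propose Federated Pruning to train a reduced model under the federated setting while maintaining performance similar to that of the full model. Moreover, compared to centralized training, the vast amount of client data can also be leveraged to improve the pruning results. We explore different pruning schemes and provide empirical evidence of the effectiveness of our methods.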
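This paper presents the scientific outcomes of the 2022 Landslide4Sense (L4S) competition organized by the Institute of Advanced Research in Artificial Intelligence (IARAI). The objective of the competition is to automatically detect landslides from large-scale, multi-source satellite imagery collected globally. The 2022 L4S aims to foster interdisciplinary research on recent developments in deep learning (DL) models for the semantic segmentation task using satellite imagery. In the past few years, DL-based models have achieved performance that meets expectations for image interpretation, owing to the development of convolutional neural networks (CNNs). The main objective of this paper is to present the details of the competition and its best-performing algorithms. The winning solutions are elaborated with state-of-the-art models such as the Swin Transformer, SegFormer, and U-Net. Advanced machine learning techniques and strategies such as hard example mining, self-training, and mix-up data augmentation are also considered. In addition, we describe the L4S benchmark dataset to facilitate further comparisons and report the results of the accuracy assessment online. The data are accessible on the Future Development Leaderboard for future evaluation at https://www.iarai.ac.ac.at/landslide4sense/challenge/, and researchers are invited to submit more prediction results, evaluate the accuracy of their methods, compare them with those of other users, and, ideally, improve on the landslide detection results reported in this paper.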